Sorcerer II Global Ocean Sampling
The Sorcerer II Global Ocean Sampling (GOS) expedition collected surface water samples along a several-thousand-kilometer transect, starting in the North Atlantic and ending in the South Pacific, from August 2003 to May 2004. The marine planktonic microbiota contained in these samples was the subject of a large metagenomic study conducted by researchers at the J. Craig Venter Institute, in which 7.7 million DNA sequence reads provided new insight into the diversity of oceanic microbial life. The results of this study were first reported in the journal PLOS Biology in 2007.

Data collection:
Most biochemical and genetic studies to date have been limited to organisms that are abundant and readily cultivated, leaving an estimated 99% of the total organisms on this planet largely uncharacterized. Thus, a central objective of the GOS study was to begin to assess the genetic diversity of microbial sea life. A total of 44 samples were collected at 41 aquatic locations (mostly marine) over 8,000 km by scientists aboard the modified yacht Sorcerer II. Samples were collected at approximately 320-km intervals. Environmental conditions, including temperature, pH, and salinity, were also recorded for each site. Each sample was separated into multiple size fractions, and total DNA was extracted primarily from the 0.1-0.8 micron fraction, which is enriched for bacteria.

Sequencing methods:
Shotgun sequencing was conducted on the collected samples using random-insert clone libraries. Collected DNA was transformed into E. coli for gel purification and treatment. A dideoxy sequencing method, using Big Dye Terminator chemistry and M13 primers, was used on the purified DNA. PCR products were loaded into an AB3730xl DNA analyzer. To place the sequence data in a larger genomic context, the Celera Assembler was used on the GOS dataset.
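As an illustration of how an overlap-based assembler can begin comparing reads, the following minimal sketch finds pairs of reads that share at least one exact 14-base substring. It is not the Celera Assembler's actual implementation. The seed length is an assumption borrowed from the 14 bp cutoff quoted under Analysis, and the function names and toy reads are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: seed length taken from the 14 bp cutoff quoted in the article.
SEED_LENGTH = 14


def kmers(read, k):
    """Yield every substring of length k in a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def candidate_overlaps(reads, k=SEED_LENGTH):
    """Return the set of read-index pairs sharing at least one exact k-mer."""
    index = defaultdict(set)  # k-mer -> indices of the reads that contain it
    for read_id, read in enumerate(reads):
        for kmer in kmers(read, k):
            index[kmer].add(read_id)

    pairs = set()
    for read_ids in index.values():
        # Every pair of reads listed under the same k-mer is a candidate.
        for a, b in combinations(sorted(read_ids), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical toy reads, not GOS data.
reads = [
    "ACGTTGCATGCCGATAGCTAGGCT",  # read 0
    "GCCGATAGCTAGGCTTAACGGATC",  # read 1: begins with the last 15 bases of read 0
    "TTTTTTTTTTTTTTTTGGGGGGGG",  # read 2: unrelated
]

print(candidate_overlaps(reads))
```

In a full assembler, candidate pairs such as these would then be aligned and scored before the layout and consensus stages. Indexing k-mers avoids comparing every read against every other read base by base.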
Analysis:
All sequences were compared in an all-against-all fashion using the Celera Assembler. Alignments were constructed using a minimum of 14 identical bp as a cutoff. The assembly algorithm consisted of three stages: overlap, layout, and consensus. Sequence phylogenies were determined by aligning segments to a reference sequence, and a refined multiple sequence alignment was generated using MUSCLE.

Results:
Classifications of genetic diversity have commonly been determined using rRNA-based analysis, such as the sequence identity of the 16S subunit. However, this method has considerable limitations: it is less able to predict biochemical variation or to clearly establish a connection between biochemical and genetic diversity. Indeed, some species, such as E. coli, have been shown to vary substantially in genome content between strains despite having extremely similar ribotypes. The unprecedented scope of this project, together with several novel analysis techniques, allowed the authors of the study to improve considerably on these limitations.

This metagenomic study generated a total of over 7.7 million sequence reads, identifying more than 1.2 million new genes. Of the assembled sequences, 85% were determined to be unique as defined by a 98% sequence identity cutoff, indicating the breadth of the sampling. The analysis indicates that nucleotide and protein sequences are less rigidly conserved than gene synteny. Furthermore, the authors demonstrated that there are defined subtypes of genetic divergence within species rather than “an unstructured swarm or cloud of variants.” In sum, the project revealed the tremendous taxonomic diversity of microbial life within marine systems, providing valuable insight into the mechanisms by which these bacteria evolve and a wealth of data for future studies.

References:
1. Rusch DB, et al. (2007) The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biology 5(3):e77.